Goto

Collaborating Authors

 aerial manipulation


A Cross-Embodiment Gripper Benchmark for Rigid-Object Manipulation in Aerial and Industrial Robotics

Vagas, Marek, Varga, Martin, Romancik, Jaroslav, Majercak, Ondrej, Suarez, Alejandro, Ollero, Anibal, Vanderborght, Bram, Virgala, Ivan

arXiv.org Artificial Intelligence

Abstract--Robotic grippers are increasingly deployed across industrial, collaborative, and aerial platforms, where each embodiment imposes distinct mechanical, energetic, and operational constraints. Established YCB and NIST benchmarks quantify grasp success, force, or timing on a single platform, but do not evaluate cross-embodiment transferability or energy-aware performance, capabilities essential for modern mobile and aerial manipulation. This letter introduces the Cross-Embodiment Gripper Benchmark (CEGB), a compact and reproducible benchmarking suite extending YCB and selected NIST metrics with three additional components: a transfer-time benchmark measuring the practical effort required to exchange embodiments, an energy-consumption benchmark evaluating grasping and holding efficiency, and an intent-specific ideal payload assessment reflecting design-dependent operational capability. T ogether, these metrics characterize both grasp performance and the suitability of reusing a single gripper across heterogeneous robotic systems. A lightweight self-locking gripper prototype is implemented as a reference case. Experiments demonstrate rapid embodiment transfer (median 17.6 s across user groups), low holding energy for gripper prototype ( 1.5 J per 10 s), and consistent grasp performance with cycle times of 3.2-3.9 CEGB thus provides a reproducible foundation for cross-platform, energy-aware evaluation of grippers in aerial and manipulators domains. Robotic grasping has been extensively investigated across industrial, collaborative, and aerial domains.


Whole-body motion planning and safety-critical control for aerial manipulation

Yang, Lin, Lee, Jinwoo, Campolo, Domenico, Kim, H. Jin, Byun, Jeonghyun

arXiv.org Artificial Intelligence

Aerial manipulation combines the maneuverability of multirotors with the dexterity of robotic arms to perform complex tasks in cluttered spaces. Yet planning safe, dynamically feasible trajectories remains difficult due to whole-body collision avoidance and the conservativeness of common geometric abstractions such as bounding boxes or ellipsoids. We present a whole-body motion planning and safety-critical control framework for aerial manipulators built on superquadrics (SQs). Using an SQ-plus-proxy representation, we model both the vehicle and obstacles with differentiable, geometry-accurate surfaces. Leveraging this representation, we introduce a maximum-clearance planner that fuses Voronoi diagrams with an equilibrium-manifold formulation to generate smooth, collision-aware trajectories. We further design a safety-critical controller that jointly enforces thrust limits and collision avoidance via high-order control barrier functions. In simulation, our approach outperforms sampling-based planners in cluttered environments, producing faster, safer, and smoother trajectories and exceeding ellipsoid-based baselines in geometric fidelity. Actual experiments on a physical aerial-manipulation platform confirm feasibility and robustness, demonstrating consistent performance across simulation and hardware settings. The video can be found at https://youtu.be/hQYKwrWf1Ak.


AERMANI-VLM: Structured Prompting and Reasoning for Aerial Manipulation with Vision Language Models

Mishra, Sarthak, Yadav, Rishabh Dev, Das, Avirup, Gupta, Saksham, Pan, Wei, Roy, Spandan

arXiv.org Artificial Intelligence

This reasoning-action loop continues until task completion, enabling the VLM to focus on semantic reasoning while delegating precise execution to robust controllers. The framework is evaluated in simulation and real-world experiments using a pretrained VLM, and comprehensive comparison and ablation studies are carried out to verify its performance. CLIPSeg [12] is used for prompt-based segmentation, maintaining a unified prompting pipeline from perception to reasoning. A. Additional Related W orks Aerial manipulation has progressed from vision-guided approaches relying on onboard cameras and artificial visual cues [13], to fully markerless grasping systems using onboard perception [14], and more recently end-effector-centric frameworks for versatile manipulation [15], yet all remain focused on execution rather than language-level reasoning. In parallel, VLAs [2]-[5] combine LLMbased planning [16], [17] with perceptual grounding from models such as CLIP [18], CLIPort [19], and LLaV A [20], but their end-to-end policies are data-intensive and prone to unsafe behaviors from ambiguous outputs, or adversarial prompts, motivating hybrid approaches where reasoning is decoupled from execution via modular skill primitives [21], [22]. For multirotors specifically, foundation model research has focused on mission planning [23], spatial reasoning [24], and direct control [25] which advances locomotion but does not extend to aerial manipulation, and it requires exploration coupled with grasping and placement [26]. In summary, control-focused aerial manipulation, reasoning-focused VLAs, and navigation-focused UA V -VLN each address parts of the problem, but none unify perception, reasoning, and execution for aerial manipulation. Together, these limitations motivate AERMANI-VLM, which unifies open-vocabulary perception, structured reasoning, and safe skill execution for aerial manipulation.


Integration of a Variable Stiffness Link for Long-Reach Aerial Manipulation

Fernandez, Manuel J., Suarez, Alejandro, Ollero, Anibal, Fumagalli, Matteo

arXiv.org Artificial Intelligence

This paper presents the integration of a Variable Stiffness Link (VSL) for long-reach aerial manipulation, enabling adaptable mechanical coupling between an aerial multirotor platform and a dual-arm manipulator. Conventional long-reach manipulation systems rely on rigid or cable connections, which limit precision or transmit disturbances to the aerial vehicle. The proposed VSL introduces an adjustable stiffness mechanism that allows the link to behave either as a flexible rope or as a rigid rod, depending on task requirements. The system is mounted on a quadrotor equipped with the LiCAS dual-arm manipulator and evaluated through teleoperated experiments, involving external disturbances and parcel transportation tasks. Results demonstrate that varying the link stiffness significantly modifies the dynamic interaction between the UAV and the payload. The flexible configuration attenuates external impacts and aerodynamic perturbations, while the rigid configuration improves positional accuracy during manipulation phases. These results confirm that VSL enhances versatility and safety, providing a controllable trade-off between compliance and precision. Future work will focus on autonomous stiffness regulation, multi-rope configurations, cooperative aerial manipulation and user studies to further assess its impact on teleoperated and semi-autonomous aerial tasks.


Design of a Flexible Robot Arm for Safe Aerial Physical Interaction

Mellet, Julien, Berra, Andrea, Seisa, Achilleas Santi, Sankaranarayanan, Viswa, Gamage, Udayanga G. W. K. N., Soto, Miguel Angel Trujillo, Heredia, Guillermo, Nikolakopoulos, George, Lippiello, Vincenzo, Ruggiero, Fabio

arXiv.org Artificial Intelligence

This paper introduces a novel compliant mechanism combining lightweight and energy dissipation for aerial physical interaction. Weighting 400~g at take-off, the mechanism is actuated in the forward body direction, enabling precise position control for force interaction and various other aerial manipulation tasks. The robotic arm, structured as a closed-loop kinematic chain, employs two deported servomotors. Each joint is actuated with a single tendon for active motion control in compression of the arm at the end-effector. Its elasto-mechanical design reduces weight and provides flexibility, allowing passive-compliant interactions without impacting the motors' integrity. Notably, the arm's damping can be adjusted based on the proposed inner frictional bulges. Experimental applications showcase the aerial system performance in both free-flight and physical interaction. The presented work may open safer applications for \ac{MAV} in real environments subject to perturbations during interaction.


Evaluation of Human-Robot Interfaces based on 2D/3D Visual and Haptic Feedback for Aerial Manipulation

Mellet, Julien, Allenspach, Mike, Cuniato, Eugenio, Pacchierotti, Claudio, Siegwart, Roland, Tognon, Marco

arXiv.org Artificial Intelligence

Most telemanipulation systems for aerial robots provide the operator with only 2D screen visual information. The lack of richer information about the robot's status and environment can limit human awareness and, in turn, task performance. While the pilot's experience can often compensate for this reduced flow of information, providing richer feedback is expected to reduce the cognitive workload and offer a more intuitive experience overall. This work aims to understand the significance of providing additional pieces of information during aerial telemanipulation, namely (i) 3D immersive visual feedback about the robot's surroundings through mixed reality (MR) and (ii) 3D haptic feedback about the robot interaction with the environment. To do so, we developed a human-robot interface able to provide this information. First, we demonstrate its potential in a real-world manipulation task requiring sub-centimeter-level accuracy. Then, we evaluate the individual effect of MR vision and haptic feedback on both dexterity and workload through a human subjects study involving a virtual block transportation task. Results show that both 3D MR vision and haptic feedback improve the operator's dexterity in the considered teleoperated aerial interaction tasks. Nevertheless, pilot experience remains the most significant factor.


An Open-Source Soft Robotic Platform for Autonomous Aerial Manipulation in the Wild

Bauer, Erik, Blöchlinger, Marc, Strauch, Pascal, Raayatsanati, Arman, Cavelti, Curdin, Katzschmann, Robert K.

arXiv.org Artificial Intelligence

Aerial manipulation combines the versatility and speed of flying platforms with the functional capabilities of mobile manipulation, which presents significant challenges due to the need for precise localization and control. Traditionally, researchers have relied on offboard perception systems, which are limited to expensive and impractical specially equipped indoor environments. In this work, we introduce a novel platform for autonomous aerial manipulation that exclusively utilizes onboard perception systems. Our platform can perform aerial manipulation in various indoor and outdoor environments without depending on external perception systems. Our experimental results demonstrate the platform's ability to autonomously grasp various objects in diverse settings. This advancement significantly improves the scalability and practicality of aerial manipulation applications by eliminating the need for costly tracking solutions. To accelerate future research, we open source our ROS 2 software stack and custom hardware design, making our contributions accessible to the broader research community.


Vision-assisted Avocado Harvesting with Aerial Bimanual Manipulation

Liu, Zhichao, Zhou, Jingzong, Mucchiani, Caio, Karydis, Konstantinos

arXiv.org Artificial Intelligence

Robotic fruit harvesting holds potential in precision agriculture to improve harvesting efficiency. While ground mobile robots are mostly employed in fruit harvesting, certain crops, like avocado trees, cannot be harvested efficiently from the ground alone. This is because of unstructured ground and planting arrangement and high-to-reach fruits. In such cases, aerial robots integrated with manipulation capabilities can pave new ways in robotic harvesting. This paper outlines the design and implementation of a bimanual UAV that employs visual perception and learning to autonomously detect avocados, reach, and harvest them. The dual-arm system comprises a gripper and a fixer arm, to address a key challenge when harvesting avocados: once grasped, a rotational motion is the most efficient way to detach the avocado from the peduncle; however, the peduncle may store elastic energy preventing the avocado from being harvested. The fixer arm aims to stabilize the peduncle, allowing the gripper arm to harvest. The integrated visual perception process enables the detection of avocados and the determination of their pose; the latter is then used to determine target points for a bimanual manipulation planner. Several experiments are conducted to assess the efficacy of each component, and integrated experiments assess the effectiveness of the system.


Flying Calligrapher: Contact-Aware Motion and Force Planning and Control for Aerial Manipulation

Guo, Xiaofeng, He, Guanqi, Xu, Jiahe, Mousaei, Mohammadreza, Geng, Junyi, Scherer, Sebastian, Shi, Guanya

arXiv.org Artificial Intelligence

Aerial manipulation has gained interest in completing high-altitude tasks that are challenging for human workers, such as contact inspection and defect detection, etc. Previous research has focused on maintaining static contact points or forces. This letter addresses a more general and dynamic task: simultaneously tracking time-varying contact force in the surface normal direction and motion trajectories on tangential surfaces. We propose a pipeline that includes a contact-aware trajectory planner to generate dynamically feasible trajectories, and a hybrid motion-force controller to track such trajectories. We demonstrate the approach in an aerial calligraphy task using a novel sponge pen design as the end-effector, whose stroke width is proportional to the contact force. Additionally, we develop a touchscreen interface for flexible user input. Experiments show our method can effectively draw diverse letters, achieving an IoU of 0.59 and an end-effector position (force) tracking RMSE of 2.9 cm (0.7 N). Website: https://xiaofeng-guo.github.io/flying-calligrapher/


Non-Prehensile Aerial Manipulation using Model-Based Deep Reinforcement Learning

Dimmig, Cora A., Kobilarov, Marin

arXiv.org Artificial Intelligence

With the continual adoption of Uncrewed Aerial Vehicles (UAVs) across a wide-variety of application spaces, robust aerial manipulation remains a key research challenge. Aerial manipulation tasks require interacting with objects in the environment, often without knowing their dynamical properties like mass and friction a priori. Additionally, interacting with these objects can have a significant impact on the control and stability of the vehicle. We investigated an approach for robust control and non-prehensile aerial manipulation in unknown environments. In particular, we use model-based Deep Reinforcement Learning (DRL) to learn a world model of the environment while simultaneously learning a policy for interaction with the environment. We evaluated our approach on a series of push tasks by moving an object between goal locations and demonstrated repeatable behaviors across a range of friction values.